ABSTRACT

We have measured the angular correlation function $w(\theta)$ for a sample of 871
Lyman-break galaxies (LBGs) in five fields at redshift z ~ 3. Fitting the power law
$A_w \theta^{-\beta}$ to a weighted average of $w(\theta)$ from the five fields over the range $12 \lesssim \theta \lesssim 330$
arcsec, we find $A_w \sim 2$ arcsec$^{\beta}$ and $\beta \sim 0.9$. The slope is, within the errors, the
same as for galaxy samples in the local and intermediate redshift universe, and a
slope $\beta = 0.25$ or shallower is ruled out by the data at the 99.9% confidence level.
Because N(z) of LBGs is well determined from 376 spectroscopic LBG redshifts, the 
real-space correlation function can be accurately derived from the angular one through 
the Limber transform. The inversion of $w(\theta)$ is rather insensitive to the still relatively
large uncertainties on $A_w$ and $\beta$, and the spatial correlation length is much more
tightly constrained than either of these parameters. We estimate $r_0 = 3.3^{+0.7}_{-0.6}$ ($2.1^{+0.4}_{-0.5}$)
$h^{-1}$ Mpc (comoving) for $q_0 = 0.1$ (0.5) at the median redshift of the survey, z = 3.04
($h$ is in units of 100 km s$^{-1}$ Mpc$^{-1}$ throughout this paper). The observed comoving
correlation length of LBGs at z ~ 3 is comparable to that of present-day spiral galaxies
and is only ~ 50% smaller than that of present-day ellipticals; it is as large or larger
than any measured in recent intermediate-redshift galaxy samples ($0.3 \lesssim z \lesssim 1$). By
comparing the observed galaxy correlation length to that of the mass predicted from
CDM theory, we estimate a linear bias for LBGs of b ~ 1.5 (4.5) for $q_0 = 0.1$ (0.5),
in broad agreement with our previous estimates based on preliminary spectroscopy. 
The strong clustering and the large bias of the LBGs are consistent with biased galaxy 
formation theories and provide additional evidence that these systems are associated 
with massive dark matter halos. The results of the clustering of LBGs at z ~ 3 
emphasize that apparent evolution in the clustering properties of galaxies may be due 
as much to variations in effective light-to-mass bias parameter among different galaxy 
samples as to evolution in the mass distribution through gravitational instability. 

Subject headings: cosmology: observations — galaxies: formation — galaxies: evolution 
— galaxies: distances and redshifts 

1. INTRODUCTION 

In most cosmological models, galaxies are expected to be biased tracers of the underlying 
mass-density field, with the level of light-to-mass bias being a function of the galaxy mass; more 
massive galaxies would tend to populate volumes of space with a higher overall mass density, and, 
as a consequence, would be characterized by a stronger spatial clustering than less massive systems 
(e.g. Kaiser 1984; Mo & White 1996). Furthermore, the bias of galaxies with respect to the mass is
expected to evolve with cosmic time as a result of gravitational growth of density perturbations
and hierarchical merging (Matarrese et al. 1997; Mann, Peacock & Heavens 1997; Bagla 1997). 
Empirically, it has been known for some time that different types of galaxies do indeed cluster 
differently; numerous large galaxy redshift surveys (Davis et al. 1988; Hamilton 1988; Santiago & 
Da Costa 1990; Loveday et al. 1995; Tucker et al. 1996; Valotto & Lambas 1997) in the local
universe have shown that early-type galaxies (E/S0) are more strongly clustered than later types
(Sp/Irr), with a two-point correlation function that is generally steeper and a correlation length
~ 2 times larger. A similar trend with the absolute luminosities of the galaxies has also been
observed (Park et al. 1994).

In the past few years, several deep redshift surveys have probed field galaxies in the 
intermediate-redshift universe (e.g. Lilly et al. 1995; Cowie, Hu & Songaila 1995). These find 
similar clustering segregation as in the local universe, with the redder and more luminous systems 
more strongly clustered than their bluer and fainter counterparts (Le Fevre et al. 1996; Carlberg 
et al. 1997), and detect apparent evolution in galaxy clustering, with a comoving correlation 
length $r_0$ that is three times smaller at z ~ 1 than in local samples. If the galaxies in these
samples at different redshifts all had the same bias with respect to the mass distribution, then the 
observed differences in galaxy clustering trace the evolution of mass clustering, and could be used 
to constrain cosmology; however, it is difficult to understand the mix of galaxy masses included 
in magnitude-limited surveys as a function of redshift. One might hope that a sample's bias
would not depend strongly on how it was selected, but if this were the case then different redshift 
surveys would currently be in quantitative disagreement with each other (Carlberg et al. 1997). It 
seems likely, then, that a sample-dependent (both because of redshift effects and selection criteria) 
light-to-mass bias could be at least partly responsible for the observed "evolution" of galaxy 
clustering with redshift, and in this case it would be difficult to draw cosmological conclusions 
from those surveys. 

Recently, it has become possible to identify large numbers of galaxies in a narrow redshift 
range using photometric techniques (e.g. Connolly et al. 1995, Steidel et al. 1996a, Madau et 
al. 1996). In contrast to traditional magnitude-limited surveys, which contain a wide range of 
galaxies over a large interval of time, and likely different mixtures of galaxies at different redshifts, 
a sample selected in this way provides a snapshot of the locations of similar galaxies over a small 
span of time. As a result, the observed clustering is much easier to interpret. An example of a 
photometric redshift technique is the Lyman-break technique (Steidel & Hamilton 1993, Steidel, 
Pettini, & Hamilton 1995, Steidel et al. 1996a, Giavalisco et al. 1996, Madau et al. 1996) which 
selects the brightest star-forming (and relatively dust-free) galaxies at high redshift. It is still
unclear what the lower redshift counterparts of these Lyman-break galaxies would be, and so one 
cannot easily draw cosmological conclusions by comparing the clustering strength of Lyman-break 
galaxies to the clustering strength of a lower-redshift sample; but we can use the sample for the 
more modest goal of constraining theories of galaxy formation. In particular there is a great deal 
of fruitful work to be done comparing the properties of this well-defined high-redshift sample 
with the predictions of numerical and semi-analytic models (Baugh et al. 1998, Jing & Suto 1998, 
Governato et al. 1998). 

In a previous paper (Steidel et al. 1998, Paper 1) we described a large concentration of LBGs 
in redshift space discovered in one of our survey fields, and argued that such a concentration would 
not exist in standard CDM cosmogonies unless LBGs were very biased tracers of mass. In the 
present paper, we present a complementary angular clustering analysis of the LBG candidates in 5 
of our survey fields, which can be used in conjunction with the spectroscopic redshift distribution 
to estimate the spatial correlation function of the Lyman-break population at z ~ 3. Again, we 
will find that these galaxies are much more strongly clustered than the mass would be according to
models of hierarchical structure formation. Such a strong clustering of forming galaxies is actually 
in agreement with predictions of models in which LBGs are associated with relatively rare and 
massive dark matter halos (e.g., Baugh et al. 1997, Mo & Fukugita 1996, Jing & Suto 1998). 

2. LYMAN-BREAK GALAXIES 

The Lyman-break technique uses color selection to identify high-redshift galaxies through 
multi-band imaging across the 912 Å Lyman-continuum discontinuity. Details of the technique
have been presented elsewhere (e.g. Steidel & Hamilton 1993; Steidel, Pettini & Hamilton 1995;
Madau et al. 1996) and here we only briefly review them. At $z \gtrsim 2.5$ the Lyman limit is redshifted
far enough into the optical window to be observable in broad-band ground-based photometry. By
placing filters on either side of the redshifted Lyman limit one can find high-redshift objects by
their strong spectral breaks. In our implementation of the technique we use a custom photometric
system, $U_n G \mathcal{R}$ (Steidel & Hamilton 1993), optimized for selecting LBGs with z ~ 3. An initial
selection region of the [$G - \mathcal{R}$, $U_n - G$] plane was chosen based on the expected colors of moderately
unreddened star-forming galaxies computed using stellar population synthesis codes (Bruzual &
Charlot 1996), and including the effects of the opacity of interstellar gas and intervening absorption
by H I (Steidel et al. 1995; see also Madau 1995). Our selection criteria were subsequently verified 
and refined after extensive spectroscopy with the Low Resolution Imaging Spectrograph (Oke et 
al. 1995) on the W.M. Keck telescope. 

In this paper an object is considered a Lyman-break galaxy if its colors satisfy 

$$(U_n - G) > 1.0 + (G - \mathcal{R}); \quad (U_n - G) > 1.6; \quad (G - \mathcal{R}) < 1.2, \qquad (1)$$

with an additional requirement $\mathcal{R} < 25.5$ imposed to produce a reasonably complete sample that
is suitable for spectroscopic follow-up. Magnitudes are in the AB scale (Oke & Gunn 1983).
We have found that at least 75% of the objects meeting these criteria are indeed high-redshift
galaxies. Figure 1 shows the redshift distribution $N(z)$ for all 376 spectroscopically identified
galaxies selected with these criteria. The median redshift is z = 3.04 and the standard deviation
is $\sigma_z = 0.27$; approximately 90% of the galaxies have $2.6 \lesssim z \lesssim 3.4$, and none have z < 2.2. About
5% of the objects meeting these criteria are stars; almost all of these are brighter than $\mathcal{R}$ ~ 24.
The remaining 20% of objects meeting these criteria have not been identified because of low S/N. 
Our success in obtaining a redshift has no obvious dependence on luminosity or color. 
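
The selection of equation (1), together with the magnitude limit, can be written as a short test on the three AB magnitudes. The following is only a minimal sketch: the function name and the example magnitudes are hypothetical and are not taken from the survey catalogs.

```python
def is_lbg_candidate(u_n, g, r):
    """Apply the colour cuts of eq. (1) and the R < 25.5 limit to AB magnitudes."""
    un_minus_g = u_n - g
    g_minus_r = g - r
    return (
        un_minus_g > 1.0 + g_minus_r   # break is strong relative to the continuum colour
        and un_minus_g > 1.6           # minimum Lyman-continuum break
        and g_minus_r < 1.2            # continuum not too red
        and r < 25.5                   # magnitude limit for spectroscopic follow-up
    )

# Hypothetical magnitudes, for illustration only.
print(is_lbg_candidate(27.5, 25.3, 24.8))  # strong break, blue continuum
print(is_lbg_candidate(25.6, 25.0, 24.6))  # weak break
```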

For the purpose of measuring angular clustering we have restricted our sample to candidates 
from the larger and deeper fields of our survey, whose salient features are summarized in Table 1. 
While each of the fields will be treated independently in the analysis below, the surface density of 
faint galaxies, and the median colors for all detected objects in each field, are consistent with one 
another after correction for Galactic reddening. Each of our images typically has seeing in the 
range 0.8 to 1.3 arcsec, and reaches $1\sigma$ surface brightness fluctuations in 1 arcsec$^2$ apertures of ~ 29.1,
29.2, and 28.6 magnitudes/arcsec$^2$ in the $U_n$, $G$, and $\mathcal{R}$ bands, respectively. However, for objects
fainter than $\mathcal{R}$ ~ 25 small differences in the depth of the $U_n$ images can influence the surface
density of the faintest LBG candidates because of the large dynamic range required to flag objects 
with significant continuum breaks. For this reason, we caution that the surface density of LBG 
candidates can have a small dependence on the quality of the images in a particular field. 

We are in the process of obtaining spectra of LBG candidates in each of these fields using 
the Keck telescopes; in the present work, we make use only of the redshift distribution for our 
spectroscopic survey as a whole. We defer to a future paper an analysis of the clustering in each 
field using the full redshift information (Adelberger et al. 1998, in preparation). 

Most of the fields chosen for our LBG survey are high latitude, low Galactic reddening fields 
that have been the subject of other faint galaxy studies; particularly relevant to our choice of fields 
was the existence of deep WFPC-2 images in the HST archive. 

The largest field is 1415+527, which is centered on a deep HST+WFPC-2 pointing and also
contains several of the pointings of the "Groth strip". The field has also been studied by Connolly
et al. (1997) for their photometric-redshift technique, and by the Canada-France Redshift Survey
(Lilly et al. 1995). Our images were obtained using the prime focus camera on the Mayall 4-m
telescope at Kitt Peak during 1996 May and cover an area of 15 x 15 arcmin$^2$; the $\mathcal{R}$ image was
supplemented with a mosaic of images (in order to cover the whole KPNO 4m field of view) 
obtained at the Palomar 5m Hale telescope with the COSMIC prime focus camera in 1997 March. 

The 2237+116 field (DSF2237) was chosen by us as a region with low Galactic extinction and
few bright stars which could be observed efficiently from both Palomar and Keck observatories
in the late summer/early fall observing season. To our knowledge, no other faint galaxy studies
have been conducted here. The imaging data were obtained in 1997 August with the Palomar 5m
telescope; the total region studied consists of two abutting pointings, each 9' by 9', aligned in the
E-W direction. 

Like the 2237+116 field, the 2215+000 field consists of two COSMIC abutting pointings which 
have in this case been aligned along the North-South direction. The images were obtained in 1995 
August, 1996 August, and 1997 August. The northern pointing is centered on the "SSA22" field 
of Cowie et al. (1995) and overlaps somewhat with the region studied as part of the CFRS redshift 
survey "22 hour" field (Lilly et al. 1995). It also includes several moderately deep HST+WFPC-2
pointings (Cowie et al. 1996; Schade et al. 1995). In Paper 1 we have presented a preliminary 
analysis of LBG spectroscopy in this region. 

The 1234+625 (Hubble Deep Field, or HDF; Williams et al. 1996) and 0050+123 ("Caltech 
Deep Field", or CDF) fields were obtained as single pointings with COSMIC. The observations 
were carried out during March and April 1996 in the former and during October 1996 in the 
latter, respectively. The HDF has been extensively followed up with Keck spectroscopy by several
groups (Cowie et al. 1997; Cohen et al. 1996), including observations of Lyman-break galaxies
identified with HST (Steidel et al. 1996b; Lowenthal et al. 1997). In the CDF, Cohen et al. (1996)
have obtained redshifts, primarily in the 0.3 < z < 1 range, as part of the "Caltech Deep Redshift
Survey"; this field was chosen by both Cohen et al. and by us because of the existence of a deep
WFPC-2 "Medium Deep Survey" image at the center. For both the HDF and the CDF our images 
cover a region much larger than, but including, the HST pointings. 



3. THE MEASURE OF $w(\theta)$

The angular correlation function $w(\theta)$ is defined in terms of the excess probability over the
random (Poisson) distribution of finding a companion in an angular shell of size $d\Omega$ placed at an
angular separation $\theta$ from a selected galaxy, given the surface density of sources $\mathcal{N}$ (Peebles 1980):

$$dP = \mathcal{N} \left[1 + w(\theta)\right] d\Omega. \qquad (2)$$

Usually $w(\theta)$ is measured by comparing the observed number of galaxy pairs at a given
separation $\theta$ to the number of pairs of galaxies independently and uniformly distributed over
the same geometry as the observed field. A number of statistical estimators of $w(\theta)$ have been
proposed (e.g. see Landy & Szalay 1993) in an attempt to minimize random and systematic errors.

We considered two estimators, proposed by Peebles (1980) and Landy & Szalay (1993),
respectively (PB and LS in the following). The LS estimator is

$$w(\theta) = \frac{DD(\theta) - 2DR(\theta) + RR(\theta)}{RR(\theta)}, \qquad (4)$$

where $DD(\theta)$ is the number of pairs of observed galaxies with angular separations in the range
$(\theta, \theta + \delta\theta)$, $RR(\theta)$ is the analogous quantity for the homogeneous (random) catalog, and $DR(\theta)$
is the number of observed-random cross pairs. Both of these statistics produce estimates of
$w(\theta)$ which are biased low by a factor (the "integral constraint") $f \sim 1 + O((\theta_0/\theta_{\max})^{\beta})$, where
$w(\theta_0) = 1$ (Peebles 1974), but (as we will see) $\theta_0 \ll \theta_{\max}$ and we will neglect this small correction.
The properties of the two statistics are discussed by Landy & Szalay (1993). Relevant to our 
analysis is the fact that the variance of the LS estimator is smaller than the PB one, and is close 
to the variance of a Poisson distribution. In addition, the LS estimator should be less sensitive 
to edge effects and the presence of spurious variations of galaxy surface density (see below). We 
measured $w(\theta)$ using both estimators to test for the presence of such systematics.
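
As an illustration of how the Landy & Szalay estimator is evaluated from pair counts, the sketch below uses brute-force counting on a flat sky. It is not the code used for the analysis: the function names are hypothetical, the pair counts are normalized by the total number of pairs of each kind (an assumption about how DD, DR and RR are scaled), and the test catalog is a uniform random one, for which the estimate should scatter around zero.

```python
import math
import random

def pair_counts(a, b, edges, auto):
    """Count pairs by separation bin; auto=True counts each pair within `a` once."""
    counts = [0] * (len(edges) - 1)
    for i, (x1, y1) in enumerate(a):
        partners = a[i + 1:] if auto else b
        for x2, y2 in partners:
            sep = math.hypot(x1 - x2, y1 - y2)
            for k in range(len(counts)):
                if edges[k] <= sep < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def landy_szalay(data, rand, edges):
    """(DD - 2DR + RR) / RR per bin, with each count normalized by its total pair number."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, auto=True)
    rr = pair_counts(rand, rand, edges, auto=True)
    dr = pair_counts(data, rand, edges, auto=False)
    w = []
    for k in range(len(dd)):
        dd_n = dd[k] / (nd * (nd - 1) / 2.0)
        rr_n = rr[k] / (nr * (nr - 1) / 2.0)
        dr_n = dr[k] / float(nd * nr)
        w.append((dd_n - 2.0 * dr_n + rr_n) / rr_n)
    return w

# Unclustered mock "galaxies" in a unit square (arbitrary units), for illustration.
rng = random.Random(42)
data = [(rng.random(), rng.random()) for _ in range(200)]
rand = [(rng.random(), rng.random()) for _ in range(600)]
edges = [0.05, 0.1, 0.2, 0.4]
w_ls = landy_szalay(data, rand, edges)
print(w_ls)
```

Replacing the uniform random catalog by the positions of another, weakly clustered population gives the cross-correlation variant described in Section 3.2.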

We measured the correlation function using different techniques, as detailed below. The 
analysis was carried out using two independently written programs, finding virtually identical 
results. In each case we computed $w(\theta)$ from each individual field and then computed a weighted
average using inverse variance weighting; it made little difference if we used Poisson or bootstrap
variance (Ling, Barrow & Frenk 1986). The error bars on the average correlation function are
the rms field-to-field variation in the estimated individual $w(\theta)$ divided by $\sqrt{N_{\rm fields}}$; Poisson and
bootstrap errors have comparable size. Masking the regions around bright stars and galaxies
where we could not have detected LBGs had a negligible effect on the results.
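
The combination of the individual fields can be sketched as follows. This is a minimal illustration with hypothetical names; in particular, taking the rms about the weighted mean and dividing the sum of squares by the number of fields (rather than by the number of fields minus one) is an assumption.

```python
import math

def combine_fields(w_fields, var_fields):
    """Inverse-variance weighted mean of w per bin, with field-to-field rms / sqrt(N_fields).

    w_fields[i][k] and var_fields[i][k] are the estimate and its variance
    for field i in angular bin k.
    """
    n_fields = len(w_fields)
    n_bins = len(w_fields[0])
    mean, err = [], []
    for k in range(n_bins):
        weights = [1.0 / var_fields[i][k] for i in range(n_fields)]
        avg = sum(wt * w_fields[i][k] for i, wt in enumerate(weights)) / sum(weights)
        rms = math.sqrt(sum((w_fields[i][k] - avg) ** 2 for i in range(n_fields)) / n_fields)
        mean.append(avg)
        err.append(rms / math.sqrt(n_fields))
    return mean, err

# Two hypothetical fields, one angular bin.
print(combine_fields([[1.0], [3.0]], [[1.0], [1.0]]))
```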

We subsequently fitted the weighted average to the power law $A_w \theta^{-\beta}$ with Levenberg-Marquardt
nonlinear least-squares (Press et al. 1992). To estimate confidence intervals on the
parameters $A_w$ and $\beta$, we generated a large ensemble of random realizations (100,000) of the
measured $w(\theta)$, assuming normal errors, and calculated best fit parameter values for each of these
synthetic data sets (e.g. Press et al. 1992 §15.6). We found that the fitted parameters depend
somewhat on the choice of the binning used to compute $w(\theta)$, and to take this additional source
of uncertainties into account we have included the effects of a randomly variable binning into the
Monte Carlo simulations. Table 2 lists the results. Because the fitted parameters are strongly
covariant, the 68% confidence intervals are misleadingly large, particularly the one relative to $A_w$;
as we shall see, the comoving correlation length $r_0$ is much more tightly constrained than either $A_w$
or $\beta$ individually.
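
The Monte Carlo procedure can be illustrated with a short sketch. To keep it self-contained, the fit here is not Levenberg-Marquardt: for each trial slope on a grid the best amplitude is obtained analytically and the slope with the lowest chi-square is kept. The random re-binning is not reproduced, the number of realizations is much smaller than the 100,000 used in the analysis, and all names and input values are hypothetical.

```python
import random

def fit_power_law(theta, w, sigma, beta_grid):
    """Weighted least-squares fit of w = A * theta**(-beta), scanning beta on a grid."""
    best = None
    for beta in beta_grid:
        x = [t ** (-beta) for t in theta]
        num = sum(wi * xi / s ** 2 for wi, xi, s in zip(w, x, sigma))
        den = sum(xi * xi / s ** 2 for xi, s in zip(x, sigma))
        amp = num / den  # best amplitude for this slope
        chi2 = sum(((wi - amp * xi) / s) ** 2 for wi, xi, s in zip(w, x, sigma))
        if best is None or chi2 < best[0]:
            best = (chi2, amp, beta)
    return best[1], best[2]

def monte_carlo_intervals(theta, w, sigma, n_real=500, seed=1):
    """68% intervals on (A, beta) from Gaussian realizations of the measured w."""
    rng = random.Random(seed)
    beta_grid = [0.01 * i for i in range(301)]
    amps, betas = [], []
    for _ in range(n_real):
        w_fake = [rng.gauss(wi, s) for wi, s in zip(w, sigma)]
        a, b = fit_power_law(theta, w_fake, sigma, beta_grid)
        amps.append(a)
        betas.append(b)

    def interval(values):
        v = sorted(values)
        return v[int(0.16 * len(v))], v[int(0.84 * len(v))]

    return interval(amps), interval(betas)

# Hypothetical measurements: theta in arcsec, 10% errors.
theta = [15.0, 30.0, 60.0, 120.0, 240.0]
w_obs = [2.0 * t ** -0.9 for t in theta]
sigma = [0.1 * wi for wi in w_obs]
print(monte_carlo_intervals(theta, w_obs, sigma, n_real=200))
```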

Figure 2 shows the weighted average estimates of $w(\theta)$ obtained from both the PB and LS
estimators. For clarity, the error bars are plotted separately from the data points, in the upper
part of the figure. A number of potential sources of systematic errors can affect the estimate of
$w(\theta)$ and we now describe the techniques that we have adopted to test for their presence.

3.1. Spurious Sources 

Systematic contamination from spurious sources, i.e. sources physically unrelated to the 
Lyman-break galaxies, is relatively easy to take into account. Only ~ 75% of the spectroscopically 
observed Lyman-break candidates have been shown to be at z ~ 3. About 5% are stars and the 
remaining 20% have not been identified. The unidentified 20% have spectra and colors consistent 
with their being at z ~ 3, but in the worst case a fraction $f$ ~ 0.25 of the photometrically selected
objects could lie at redshifts outside of our primary selection window. If these objects were not
clustered our estimate of $w(\theta)$ would be low by a factor of $1/(1 - f)^2$ ~ 1.56. We will refer to this
as the case of "maximum contamination".
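
The dilution factor and its effect on the correlation length can be checked numerically. The sketch below assumes that $r_0$ scales as $A_w^{1/\gamma}$ at fixed slope, as follows from equation (6), and uses $\gamma = 1.98$ (the fiducial $\beta$ plus one); the function names are hypothetical. Evaluated this way, the quoted factor of 1.56 corresponds to $f = 0.2$, whereas $f = 0.25$ gives about 1.78.

```python
def contamination_factor(f):
    """Factor by which w is diluted when a fraction f of the sample is unclustered."""
    return 1.0 / (1.0 - f) ** 2

def r0_boost(f, gamma):
    """Corresponding factor on r0, assuming A_w is proportional to r0**gamma."""
    return contamination_factor(f) ** (1.0 / gamma)

for f in (0.20, 0.25):
    print(f, round(contamination_factor(f), 3), round(r0_boost(f, 1.98), 3))
```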

3.2. Field-to-Field Variations 

Particularly insidious are slight variations in detection probability across the chip due to 
structure in flatfields, optical aberrations, software performance, and so on. If uncorrected, the 
resulting density gradients can mimic galaxy clustering. We have tested for the presence of such 
effects by using the locations of non-LBG galaxies instead of a uniform (random) distribution 
when estimating DR and RR. Because these faint galaxies are intrinsically very weakly clustered, 
more than 10 times less clustered than the LBGs at $\mathcal{R}$ ~ 25.5 (e.g. Brainerd, Smail, & Mould 1995),
they provide a reasonable approximation to a random distribution, and they have the advantage
of being subject to similar angular variations in detection probability as our LBG sample. As can
be seen in Table 2, the difference between measures of $w(\theta)$ (from the same estimator) obtained
using the random distribution or the cross-correlation technique described above is comparable to 
the random error and does not show any systematic trend, suggesting that the effect discussed 
here is not important at the current level of precision of the data. 

Variations of sensitivity or in the photometric calibration (particularly in the $U_n$ band) across
individual fields are another possible source of spurious clustering signal. This is because the
density of objects with $U_n - G$ and $G - \mathcal{R}$ colors close to the edge of the color selection "window"
is relatively high, and small variations in the photometry can result in a significant number of
galaxies being excluded from or included in the sample. Small spatial variations in sensitivity or
color are most likely to have the same dependence as the $\mathcal{R}$ detection probability, as they would
be due to small amounts of vignetting and/or variation in image quality that would affect all
bandpasses in a similar fashion, to first order. The quality of the CCDs used to obtain the images
is high enough that spatial variations in quantum efficiency, even in the $U_n$ band, are small enough
so as to be negligible in this context. Moreover, our photometric selection criteria are conservative 
enough that any objects scattering into or out of our selection window would also be bona fide 
LBGs, and their redshift distribution would not be different enough to have a significant effect on 
the clustering properties. That is, small spatial variations in the color zero point could have only 
a second-order effect on the clustering properties, given that the transformation of the angular 
into the spatial correlation function depends relatively weakly on the redshift distribution. 

3.3. Field Geometry 

Two of our fields, SSA22 and DSF2237, were each constructed from two abutting CCD frames 
aligned along one direction in order to produce composite samples covering larger area than that 
of the individual pointings. In general, whether it is advantageous to estimate $w(\theta)$ from one large
field with N galaxies or from M subfields with N/M galaxies and then average the results depends
on the noise characteristics of the adopted estimator (see Landy & Szalay 1993 for a discussion).
If the noise scales as 1/N, as in the case of the PB estimator, then it would make no difference.
Analyzing the abutting fields separately instead of together increases the total noise by a factor of
M in the case of the LS estimator, because its variance scales as $1/N^2$. Furthermore, in both cases
an additional error is also introduced by the 1/M loss in the total pairs. Thus, the combination
of composite fields and the LS statistics offers the possibility of extracting $w(\theta)$ from the data
while maximizing the S/N, which is useful in cases like ours where the samples are still relatively small
and the clustering signal rather weak.
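
The scaling argument can be made explicit. As a sketch, write $\sigma^2(n)$ for the variance of an estimate obtained from a field containing $n$ galaxies, and assume that the $M$ subfields are independent and contain $N/M$ galaxies each (the constant $k$, the independence of the subfields, and the reading of "noise" as variance in both cases are assumptions made here for illustration):

```latex
\sigma^2_{\rm split} = \frac{1}{M}\,\sigma^2\!\left(\frac{N}{M}\right)
\quad\Longrightarrow\quad
\begin{cases}
\sigma^2(n) = k/n \;\;(\mathrm{PB}): &
  \sigma^2_{\rm split} = \dfrac{1}{M}\,\dfrac{kM}{N} = \dfrac{k}{N} = \sigma^2(N), \\[2ex]
\sigma^2(n) = k/n^2 \;\;(\mathrm{LS}): &
  \sigma^2_{\rm split} = \dfrac{1}{M}\,\dfrac{kM^2}{N^2} = M\,\dfrac{k}{N^2} = M\,\sigma^2(N).
\end{cases}
```

With a $1/N$ scaling the split and composite analyses are equivalent, while with a $1/N^2$ scaling the split analysis is noisier by the factor $M$.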

Unfortunately, fluctuations of the apparent galaxy surface density from field to field due to 
images of differing depths, galactic extinction and reddening, and so on, can introduce an artificial 
clustering signal in the observed $w(\theta)$ and cancel the benefits of using composite fields. Although
a relatively large overlapping region between the two individual images that compose these fields
allowed us to check the consistency of the photometry and selection criteria between them, the
possibility that the two sub-fields have slightly different depth is difficult to test. Therefore, we
have studied both cases of split and composite fields with both the PB and LS statistics, each
time using both the random and cross-correlation techniques. The results are listed in Table 2.
It can be seen that, in every case, $w(\theta)$ in the case of composite fields is shallower and with a
larger correlation amplitude than that from the split fields, whose parameters also have larger
random errors. Although the difference is comparable to the $1\sigma$ error bars, making the results
still consistent with each other, there is a possible presence of a systematic error from this effect. 



3.4. The PB estimator vs the LS estimator 

Table 2 and Figure 2 show that the measures of $w(\theta)$ obtained using the PB estimator are
systematically larger than those from the LS one, with the fractional difference between the
two statistics being the largest at separations of the order of ~ 100 arcsec. As a result, the
correlation function obtained using the PB statistics has a shallower slope and larger correlation
amplitude, and the random errors on the fitted parameters are smaller. Again, such differences are
comparable to the $1\sigma$ error bars and, formally, the parameters derived from the two estimators
agree with each other. However, it is clear that the difference is systematic. It is beyond the scope
of this paper to analyze the relative merits of the two estimators, and here we simply report all
the results, cautioning that the discrepancy that we found, although comparable to the random
errors, may imply that the two statistics are sensitive to the presence of low-level systematics in
different ways.



4. THE INVERSION OF THE ANGULAR FUNCTION 



The angular correlation function $w(\theta)$ can be obtained from the spatial one, $\xi(r)$, through the
Limber transform if the galaxies' redshift distribution $dN/dz$ is known (Peebles 1980; Efstathiou
et al. 1991). If the spatial function can be modeled as

$$\xi(r) = (r/r_0)^{-\gamma} \times f(z), \qquad (5)$$

where $f(z)$ describes its redshift dependence, the angular function has the form $w(\theta) = A_w \theta^{-\beta}$,
where $\beta = \gamma - 1$ and

$$A_w = C\, r_0^{\gamma} \int_{z_i}^{z_f} f(z)\, D_\theta^{1-\gamma}(z) \left[\frac{dN}{dz}\right]^2 g^{-1}(z)\, dz \times \left[\int_{z_i}^{z_f} \frac{dN}{dz}\, dz\right]^{-2} \qquad (6)$$

(Efstathiou et al. 1991). Here $D_\theta(z)$ is the angular diameter distance,

$$g(z) = \frac{c}{H_0} \left[(1+z)^2 (1 + \Omega_0 z)^{1/2}\right]^{-1}, \qquad (7)$$

and $C$ is a numerical factor given by

$$C = \sqrt{\pi}\, \frac{\Gamma[(\gamma-1)/2]}{\Gamma(\gamma/2)}. \qquad (8)$$
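
Once the redshift integral of equation (6) has been evaluated, the inversion for $r_0$ is algebraic. The sketch below shows only that last step, with hypothetical function names. It assumes that `limber_integral` already contains the full redshift integral of equation (6), including the normalization by the integral of $dN/dz$, and that $A_w$ is expressed in units consistent with it (for example radians$^{\beta}$ if distances are in Mpc).

```python
import math

def limber_constant(gamma):
    """Numerical factor C of eq. (8)."""
    return math.sqrt(math.pi) * math.gamma((gamma - 1.0) / 2.0) / math.gamma(gamma / 2.0)

def r0_from_amplitude(a_w, gamma, limber_integral):
    """Invert A_w = C * r0**gamma * I for r0, where I is the redshift integral of eq. (6)."""
    return (a_w / (limber_constant(gamma) * limber_integral)) ** (1.0 / gamma)

# For gamma = 2 the constant reduces to pi.
print(limber_constant(2.0))
```

Because $r_0$ depends on the amplitude only through the power $1/\gamma$, a given fractional error on $A_w$ translates into a smaller fractional error on $r_0$.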



The Lyman-break galaxies' redshift distribution is considerably narrower than those of
traditional flux-limited redshift surveys and spans a redshift interval of only $\Delta z$ ~ 0.8 (the
corresponding cosmic-time interval is $\approx 0.35\,h^{-1}$ ($0.26\,h^{-1}$) Gyr for $q_0 = 0.1$ ($q_0 = 0.5$)). In this
redshift range and at the spatial scales considered here (larger than a few Mpc), the evolution
of cosmic structure is largely driven by the growing mode of linear perturbations $D(t)$. The
front-to-end fractional variation of $D(t)$ in the above redshift range is ~ 13% (18%), which is an
upper limit to the effective variation of the correlation length in our sample because of the peaked
redshift distribution. It is reasonable, therefore, to expect little evolution of LBG clustering in our
sample over so short a time. In this case the function $f(z)$ above can be taken out of the integral
and the quantity $r_0(z) = r_0 [f(z)]^{1/\gamma}$ is then the correlation length at the epoch of observations.
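
For the $q_0 = 0.5$ (Einstein-de Sitter) case these numbers can be checked in closed form, since the linear growing mode scales as $1/(1+z)$ and the age as $(1+z)^{-3/2}$. The sketch below assumes the interval $2.6 \le z \le 3.4$ that contains about 90% of the spectroscopic sample; this choice of limits and the function names are assumptions made for illustration. The $q_0 = 0.1$ case requires the open-model expressions and is not covered here.

```python
HUBBLE_TIME_GYR = 9.778  # 1/H0 in units of h^-1 Gyr

def eds_growth(z):
    """Linear growing mode in an Einstein-de Sitter universe (arbitrary normalization)."""
    return 1.0 / (1.0 + z)

def eds_age(z):
    """Age of an Einstein-de Sitter universe at redshift z, in h^-1 Gyr."""
    return (2.0 / 3.0) * HUBBLE_TIME_GYR * (1.0 + z) ** -1.5

z_lo, z_hi = 2.6, 3.4  # assumed limits of the redshift interval
growth_change = 1.0 - eds_growth(z_hi) / eds_growth(z_lo)
elapsed = eds_age(z_lo) - eds_age(z_hi)
print(growth_change, elapsed)
```

With these limits the growth factor changes by about 18% and the elapsed time is about $0.25\,h^{-1}$ Gyr, close to the values quoted in the text.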

We list in Table 3 the comoving correlation lengths $r_0(z)$ at z = 3.04 obtained through Eqn.
(6) using the average $w(\theta)$ and the redshift distribution $N(z)$ plotted in Figure 1. The 68%
confidence intervals were computed using Monte Carlo simulations. As mentioned above, the
correlation length turns out to be much more tightly constrained than the individual parameters
of the angular correlation function, with a typical fractional error of ~ 20-30% at the $1\sigma$ level.
This is enough precision to show the effects of the systematic differences between the values of $r_0$
obtained from the PB and LS statistics and between keeping the two adjacent pointings in the
SSA22 and DSF2237 fields as independent or joining them into two composite fields, respectively.
Figures 3a and 3b show the distribution of values of $r_0$ obtained combining together the measures
of $r_0$ from the Monte Carlo simulations for the PB (thin continuous histogram) and LS estimators
(broken histogram) respectively, for the two values of $q_0$ considered in the paper. We defer a deeper
analysis of the systematics to future papers. At this time, because the differences between all the
various measures of $r_0$ that we have obtained are still comparable to the $1\sigma$ random errors, we list
them all. As our fiducial measure and error bar we adopt the median of the distribution obtained
by merging the PB and LS Monte Carlo distributions and its corresponding 68% confidence
interval. These are $r_0 = 3.3^{+0.7}_{-0.6}$ and $r_0 = 2.1^{+0.4}_{-0.5}$ $h^{-1}$ Mpc for $q_0 = 0.1$ and $q_0 = 0.5$, respectively,
and the histograms of these distributions are plotted in Figure 3 as thick continuous lines. We
have used these values to produce the plots in Figure 4. Our fiducial value of $\beta$, obtained in a
similar fashion, is $\beta = 0.98^{+\ldots}_{-\ldots}$, and the histograms of the Monte Carlo simulations for the PB, LS
and combined samples are plotted in Figure 3c. Finally, we note that all values of the correlation
length would be approximately 25% higher in the case of "maximum contamination."

We can estimate the bias of these galaxies by comparing their correlation function $\xi_g$ to the
correlation function of the mass $\xi_m$:

$$b(r) \equiv \left[\xi_g(r) / \xi_m(r)\right]^{1/2}. \qquad (9)$$






Although (as eq. 9 shows) the bias is in principle a function of scale, our constraint on the
power-law exponent $\gamma = 1 + \beta$ is relatively weak (see Table 3), and we can only estimate a
"typical" value of the bias over the scales of a few Mpc which are probed here. In practice, we use
the ratio of the correlation length of the LBGs to that of $\xi_m(r)$ predicted by the CDM theory to
compute the bias, which is therefore relative to $r = 1\,h^{-1}$ Mpc. Using a CDM power spectrum
with shape parameter $\Gamma^* = 0.25$, claimed to fit the shape of the local large-scale structure very
well (Peacock 1997), and normalization of Eke, Cole, & Frenk (1996), we estimate b ~ 1.5 (4.5)
for $q_0 = 0.1$ (0.5). Choosing $\Gamma^* = 0.20$ results in b ~ 1.5 (5), while adopting the normalization of
White, Efstathiou & Frenk (1993) results in b ~ 1 (4).
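
The comparison can be sketched for the simple case in which both correlation functions are approximated by power laws near the scale of interest, taking the usual definition of the bias as the square root of the ratio of the galaxy and mass correlation functions. The power-law form for the mass, the function names and the numerical inputs are all assumptions made for illustration; they are not the CDM predictions used in the text.

```python
import math

def xi_power_law(r, r0, gamma):
    """Power-law correlation function (r / r0)**(-gamma)."""
    return (r / r0) ** (-gamma)

def bias_at_scale(r, r0_gal, gamma_gal, r0_mass, gamma_mass):
    """Bias at separation r, defined as sqrt(xi_gal / xi_mass)."""
    return math.sqrt(xi_power_law(r, r0_gal, gamma_gal) / xi_power_law(r, r0_mass, gamma_mass))

# Hypothetical inputs (lengths in h^-1 Mpc): equal slopes, galaxy r0 twice the mass r0.
print(bias_at_scale(1.0, 4.0, 2.0, 2.0, 2.0))
```

When the two slopes differ, the value returned depends on the separation at which it is evaluated, which is why the estimate is tied to a specific scale.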

Overall, these bias values are slightly lower than those estimated in Paper 1, but are consistent 
with them at the ~ 10% confidence level. 

5. DISCUSSION 

The high efficiency of the Lyman-break technique and the relatively narrow range of 
redshifts/cosmic time that it probes make angular clustering a particularly economic means to 
study large-scale structure at high redshifts, once the redshift distribution of the galaxy candidates 
has been measured. Not only is one free from securing a complete spectroscopic follow-up 
of the candidates, but the systematics due to selection effects are easier to handle than
those that affect studies of spatial clustering using the full redshift information. Our two main
conclusions from an analysis of the angular clustering of Lyman-break galaxies in the redshift
range $2.5 \lesssim z \lesssim 3.4$ are that these systems are strongly clustered and that their correlation
function has a slope which is as steep as or steeper than that of local galaxies.

In this section, we discuss the implications of these conclusions in turn. 

5.1. The Slope of The Correlation Function 

The correlation function of the Lyman-break galaxies has a slope that is comparable to or 
steeper than that measured at intermediate and low redshifts. Table 2 shows the values of $\beta$
obtained from the various techniques discussed above for each of the PB and LS estimators,
respectively. Combining all the PB Monte Carlo distributions together, we found $\beta_{PB} = 0.80^{+\ldots}_{-\ldots}$,
where the error bars are the 68% confidence interval. Within the errors, this is the same value
found for field galaxies in flux-selected surveys. The LS estimator returns a slightly steeper
correlation function, with $\beta_{LS} = 1.1^{+\ldots}_{-\ldots}$, comparable to that of the earliest (E/S0) and/or most
luminous local galaxies (e.g. Loveday et al. 1995). Combining all the Monte Carlo distributions
together we found $\beta = 0.98^{+\ldots}_{-\ldots}$. The distribution of the PB slope rules out $\beta = 0.25$ or shallower
at the 99.9% confidence level, while the distribution of the LS slopes rules out $\beta = 0.49$ or
shallower at the same confidence level.
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The evolution of the slope of the correlation function of the mass (or, equivalently, that 
of the power spectrum at small scales) has a pronounced dependence on f2. For a CDM-like 
power spectrum, it depends very weakly on the shape parameter T* and, for flat models, on the 
normalization. As Eqn.(9) shows, the slope of S,g( r ) differs from that of £ m (r) because of the 
dependence of the bias parameter b(r) with the spatial scale. The form of b(r), its dependence 
on galaxy properties and how it evolves with redshift are still subjects of discussion (e.g. Mann, 
Peacock & Heavens 1997; Bagla 1997). If the scale dependence of b(r) for the LBGs over the 
spatial scales probed by our correlation analysis, namely 1 rS r ^ 10 fo _1 Mpc, is similar to that 
of the local galaxies, then our measures of 7 = (3 + 1 are inconsistent with £ m (r) from the CDM 
theory if = 1. With our choice of T* = 0.25 we found j m = 1.25 (over the range 1 < r < 10 
ft^Mpc), independently of the normalization. As mentioned above, the dependence on T* is very 
weak. If T* = 0.1 then j rn = 0.98, while if T* = 0.6, then 7™ = 1.14. Using the slope measured 
from the PB estimator, the steepest CDM slope (■y m = 1.25) is ruled out by our data at the 
99.90% confidence level. Using the measures from the LS estimator, it is ruled out at the 99.991% 
confidence level. Open CDM models with the same parameters as above produce slopes in the 
range 1.6 < 7 m < 2.1 (in open models the slope of £ m (r) has a more pronounced dependence on 
the normalization), which are all consistent with our data. 
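
The confidence levels quoted above come from the Monte Carlo distributions of the fitted slope. One simple way to read such a distribution is to count the fraction of realizations that lie above a trial value; the trial value "or shallower" is then ruled out at that level. The sketch below illustrates only this bookkeeping. The helper name `exclusion_level` and the list of slopes are invented for the example and are not the distributions used in this paper.

```python
def exclusion_level(samples, threshold):
    """Fraction of Monte Carlo samples strictly above `threshold`.

    A slope equal to `threshold` or shallower is ruled out at this level.
    """
    if not samples:
        raise ValueError("need at least one Monte Carlo sample")
    above = sum(1 for s in samples if s > threshold)
    return above / len(samples)


# Illustrative slopes only: NOT the Monte Carlo output of the paper.
beta_samples = [0.55, 0.62, 0.71, 0.78, 0.80, 0.84, 0.93, 1.02, 1.10, 0.20]

# Test a slope beta = 0.25 directly.
level_beta = exclusion_level(beta_samples, 0.25)

# Test a mass slope gamma_m = 1.25 through gamma = beta + 1.
gamma_samples = [b + 1.0 for b in beta_samples]
level_gamma = exclusion_level(gamma_samples, 1.25)

print(level_beta, level_gamma)
```

With a real distribution of many thousands of realizations the same count gives levels such as 99.9%; with only ten invented samples the resolution is of course limited to steps of 10%.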

The above computations assume a bias that is constant with spatial scale. We stress, however,
that the evolution of the slope of the correlation function is useful for constraining cosmological
models only if the dependence of the bias on the spatial scale and its evolution with redshift are
known. The function $b(r)$ also depends on the properties of the halos, which further complicates
the interpretation of the data because of the difficulty of establishing an evolutionary sequence
between the systems observed at high redshifts and the local galaxies. Bagla's (1997) N-body
simulations seem to suggest that the bias will not be strongly scale-dependent (his $b(r)$ for
$M > 2 \times 10^{12}\ M_\odot$ halos at z = in standard CDM has a power-law slope of only ~ -0.18), and
if $b(r)$ for Lyman-break galaxies is similarly flat, our conclusions about the slope would not be
significantly changed. But until more is known about the scale dependence of the bias they will
remain speculative.
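
To make the role of a scale-dependent bias concrete, the following sketch propagates a power-law $b(r)$ into the galaxy slope. It assumes that the galaxy and mass correlation functions are related by $\xi_g(r) = b^2(r)\,\xi_m(r)$ and that both $\xi_m$ and $b$ are pure power laws over the scales of interest; the function name `galaxy_slope` is hypothetical. Under these assumptions a bias $b(r) \propto r^{s}$ changes the slope by $-2s$.

```python
def galaxy_slope(gamma_m, bias_log_slope):
    """Slope gamma_g of xi_g ~ r**(-gamma_g).

    Assumes xi_m ~ r**(-gamma_m), b(r) ~ r**bias_log_slope and
    xi_g = b(r)**2 * xi_m (an assumption of this sketch).
    """
    return gamma_m - 2.0 * bias_log_slope


# Scale-independent bias: the galaxy slope equals the mass slope.
gamma_flat = galaxy_slope(1.25, 0.0)

# A bias with logarithmic slope -0.18 steepens the slope by 0.36.
gamma_tilted = galaxy_slope(1.25, -0.18)

print(gamma_flat, round(gamma_tilted, 2))
```

Whether a shift of this size matters depends on the measurement errors on the slope and on how the bias is defined in the simulation being compared.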

5.2. Spatial Clustering and its "Evolution" 

The LBGs at z ~ 3 are characterized by strong spatial clustering, with a comoving correlation
length of ~ 3 to 4 $h^{-1}$ Mpc for a low matter density ($\Omega_0 = 0.2$) Universe. The correlation length
would be ~ 25% higher if the "maximum contamination" applies. This is comparable to the
clustering of present-day spiral and IRAS galaxies, and a factor of $\approx 2$ smaller than that of
present-day ellipticals.

A simple comparison of the observed clustering properties with the expected clustering of a 
suitably normalized CDM density field, as shown in Figure 3b, suggests that, in the context of such 
models, the LBGs must be substantially biased with respect to the dark matter, with higher bias
required in models with higher matter density. This result is in qualitative agreement with our
conclusions in Paper 1 on the basis of the clustering in redshift space in one of our survey fields, 
although the redshift-space analysis may imply somewhat higher bias (on slightly different scales) 
than the correlation function analysis presented here. As discussed there, the strong clustering and 
the large bias of the Lyman-break galaxies are consistent with biased galaxy formation theories 
and provide additional evidence that these systems are associated with massive dark matter halos. 

One might be tempted to try to fit the Lyman-break galaxy clustering properties onto a
general evolutionary sequence for galaxy clustering versus cosmic epoch. Figure 4a shows the
galaxy clustering strength measured from various redshift surveys, including the LBGs, as a
function of redshift. To quantify the clustering strength we have used the function $r_0^{\gamma} \times (1+z)^3$,
where now $r_0$ is in proper coordinates. Figure 4b shows the corresponding plot of the bias derived
using the CDM mass correlation function, which we have computed using the non-linear code for
the evolution of the power spectrum by Peacock (1997). We used the shape parameter $\Gamma^* = 0.25$
and normalized the power spectrum to $\sigma_8 = 1.0$ for the open model and $\sigma_8 = 0.5$ in the Einstein-de
Sitter case. The error bars have been computed by propagating the errors in the measure of the
correlation length, but it is clear that the dominant uncertainty is in the choice of normalization
of the theoretical curve.

As Figure 4 suggests, the variations in the value of the "effective" bias from sample to
sample complicate the interpretation of the apparent evolution of galaxy clustering as due to
the gravitational growth of structures, preventing us from deriving information on the clustering
evolution of the mass. For example, the data show that the traditional power-law model
$\xi(r, z) = \xi_0(r) \times (1+z)^{-(3+\epsilon)}$ used to describe the gravitational evolution of clustering (Peebles
1980) is not a good representation for any value of $\epsilon$ over the redshift interval $z \lesssim 3$. In fact, it
is clear from N-body simulations (e.g., Brainerd & Villumsen 1993, Bagla 1997) that the behavior
of the clustering of halos, as opposed to that of the overall mass, will have a non-trivial dependence
on redshift that is not accounted for in the "$\epsilon$" models. Given also the uncertainties
in the way in which dark halos are effectively sampled in any given redshift survey, we question
the ultimate usefulness of fitting values of this parameter to the results of redshift surveys over
substantial redshift baselines.
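
For reference, the prediction of the "$\epsilon$" model for the correlation length can be written down in a few lines. The sketch assumes a pure power law, $\xi(r, z) = (r/r_0)^{-\gamma} (1+z)^{-(3+\epsilon)}$ with $r$ in proper coordinates, so that the proper length at which $\xi = 1$ is $r_0 (1+z)^{-(3+\epsilon)/\gamma}$. The function names and the input values ($r_0 = 5\ h^{-1}$ Mpc, $\gamma = 1.8$) are illustrative choices, not measurements from this paper.

```python
def r0_proper(r0_local, z, gamma, epsilon):
    """Proper correlation length at redshift z in the epsilon model."""
    return r0_local * (1.0 + z) ** (-(3.0 + epsilon) / gamma)


def r0_comoving(r0_local, z, gamma, epsilon):
    """Comoving correlation length: the proper length times (1 + z)."""
    return (1.0 + z) * r0_proper(r0_local, z, gamma, epsilon)


r0_local = 5.0   # illustrative local value, h^-1 Mpc
gamma = 1.8      # illustrative slope

# epsilon = 0: clustering fixed in proper coordinates ("stable clustering").
stable = r0_comoving(r0_local, 3.0, gamma, 0.0)

# epsilon = gamma - 3: clustering fixed in comoving coordinates.
fixed = r0_comoving(r0_local, 3.0, gamma, gamma - 3.0)

print(stable, fixed)
```

In this family of models the comoving correlation length at z = 3 is never larger than the local one unless $\epsilon < \gamma - 3$, which is why a single value of $\epsilon$ struggles to connect samples with very different effective bias.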

Lyman-break galaxies at z ~ 3 represent both a large jump in redshift and a substantially
different detection/selection technique from those used previously in most galaxy clustering
analyses. It is highly likely that a similar selection criterion applied to nearby galaxies would
result in a very different correlation function from that of (nearby) optically-selected galaxies. If star
formation progresses to less massive systems with time (e.g., Cowie et al. 1997, Giavalisco et
al. 1996), then a Lyman-break galaxy sample over a large range of redshifts would likely exhibit
a gradually diminishing clustering strength with time, rather than the monotonically increasing
clustering strength expected from the growth of the overall mass fluctuations through gravitational instability.
Our results emphasize that the apparent "evolution" of the clustering properties
of galaxies in deep surveys (e.g., Efstathiou 1995; Brainerd et al. 1995) cannot be directly
interpreted as a reliable measure of the overall growth of structure. In a typical survey limited by
apparent magnitude, the mix of galaxy types and the relative sensitivity of the survey to stellar
mass, star formation rate, and bolometric luminosity (and the complicated manner in which these
quantities are related to overall galaxy mass) will be strongly redshift-dependent. It is apparent
now, if it has not always been, that the clustering properties of galaxies at moderate to high
redshift cannot be used as a cosmological tool unless one is prepared to simultaneously understand
where galaxies form and how they evolve relative to the underlying distribution of dark matter:
cosmology and galaxy formation cannot be understood independently in this context.

These problems argue strongly in favor of a focused approach to studies of large-scale structure 
over substantial redshift baselines, and call into question the very use of the phrase "evolution of 
galaxy clustering". The results found in this paper reiterate that selecting a particular class of 
objects as defined by their observational properties, over a relatively narrow range of cosmic time, 
may offer the best hope of constructing meaningful samples for clustering analyses that can be 
compared with theoretical predictions (e.g. Le Fevre et al. 1996; Carlberg et al. 1997). At very 
high redshifts the specific mechanisms of galaxy formation are expected to affect the clustering of 
forming systems the most. In the case of the z ~ 3 LBGs, the interval of cosmic time spanned
by such samples is small enough, and the selection criterion uniform enough, that even if it
may be quite model-dependent to "map" them onto samples of galaxies selected differently at 
other redshifts, one might at least be confident of measuring something specific that can be easily 
compared to the predictions of models or simulations (e.g. Adelberger et al. 1998, Giavalisco et 
al. 1998b, in prep.). 

Because the LBG selection technique (or any equivalent to it) is sensitive to those
galaxies that are the most actively star-forming systems (at whatever redshift/epoch is probed),
understanding the evolution of the clustering with redshift would involve not only the modeling
of the underlying dark matter distribution, which is now relatively straightforward using N-body
simulations, but also a detailed understanding of which types of objects (i.e., which "halos") would 
be harboring star formation at detectable levels as a function of time. Simulations that include 
star formation (e.g., Baugh et al. 1997, Weinberg et al. 1997) can make direct predictions of the 
clustering properties of objects subject to a star formation rate threshold, based on semi-analytic 
treatment of star formation within dark matter halos. The observed clustering strength of LBGs 
is close to predictions of these models (particularly after possible corrections for a small amount of 
contamination in the photometrically selected LBG sample), and similar numbers result naturally 
from pure N-body or analytic models that assume a mapping of the most massive virialized halos 
at z ~ 3 to objects likely to be visible because of their star formation (Bagla 1997, Mo & Fukugita 
1996, Jing & Suto 1997). 

Our current measurements can be improved upon in a number of ways. First, obtaining 
photometric identifications of LBGs over much larger fields would result in much better 
independent constraints on the amplitude and slope of the angular correlation function. Second,
a full real-space analysis of the clustering properties of the LBGs in the same fields will offer a
much higher signal-to-noise ratio (and a different dependency on cosmological parameters) than
can be attained from the angular distribution of candidates coupled with an empirical redshift 
selection function. The real-space analysis requires a more careful treatment of observational 
selection effects, and reasonably complete spectroscopy, and we have deferred this to a future 
paper (Adelberger et al. 1998). 

6. SUMMARY 

We have measured the angular correlation function $w(\theta)$ of Lyman-break galaxies at redshift
z ~ 3. Fitting the power law $A_w \theta^{-\beta}$ to a weighted average of $w(\theta)$ from the five fields over the
range $12 \lesssim \theta \lesssim 330$ arcsec, we find $A_w \sim 2$ arcsec$^{\beta}$ and $\beta \sim 0.9$. The slope is, within the errors, the
same as for galaxy samples in the local and intermediate redshift universe, and a slope $\beta = 0.25$
or shallower is ruled out by the data at the 99.9% confidence level.
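
As an illustration of what fitting a power law $A_w \theta^{-\beta}$ involves, the sketch below performs a weighted least-squares fit in log-log space using only the Python standard library. It is not necessarily the fitting procedure used in this paper, the function name `fit_power_law` is hypothetical, and the data points are synthetic, generated from $A_w = 2$ and $\beta = 0.9$ without noise.

```python
import math


def fit_power_law(theta, w, sigma):
    """Weighted least-squares fit of w = A * theta**(-beta) in log-log space.

    theta : angular separations (arcsec)
    w     : correlation-function values (must be positive)
    sigma : 1-sigma errors on w
    Returns (A, beta).
    """
    x = [math.log(t) for t in theta]
    y = [math.log(v) for v in w]
    # To first order the error on ln(w) is sigma/w; weights are inverse variances.
    wt = [(v / s) ** 2 for v, s in zip(w, sigma)]
    sw = sum(wt)
    xm = sum(a * b for a, b in zip(wt, x)) / sw
    ym = sum(a * b for a, b in zip(wt, y)) / sw
    sxx = sum(a * (b - xm) ** 2 for a, b in zip(wt, x))
    sxy = sum(a * (b - xm) * (c - ym) for a, b, c in zip(wt, x, y))
    slope = sxy / sxx
    intercept = ym - slope * xm
    return math.exp(intercept), -slope


# Synthetic, noise-free points (not the measured w(theta)).
theta = [12.0, 30.0, 75.0, 150.0, 330.0]
w = [2.0 * t ** -0.9 for t in theta]
sigma = [0.2 * v for v in w]

A_fit, beta_fit = fit_power_law(theta, w, sigma)
print(A_fit, beta_fit)
```

The log-space form cannot handle bins where the measured $w(\theta)$ is zero or negative, which is one reason a fit in linear space may be preferable for noisy data.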

Because the redshift distribution $N(z)$ of LBGs is well determined from 376 spectroscopic
redshifts, we have derived the real-space correlation function from the angular one through the
Limber transform. The inversion of $w(\theta)$ is rather insensitive to the still relatively large
uncertainties on $A_w$ and $\beta$, and the spatial correlation length $r_0$ is much more tightly constrained
than these. Using Monte Carlo simulations to derive the $1\sigma$ error bars from the 68% confidence
interval, we estimate $r_0 = 3.3^{+0.7}_{-0.6}$ ($2.1^{+0.4}_{-0.5}$) $h^{-1}$ Mpc (comoving) for $q_0 = 0.1$ (0.5) at the median
redshift of the survey, z = 3.04. Thus, the observed comoving correlation length of LBGs at
z ~ 3 is comparable to that of present-day spiral galaxies and is only ~ 50% smaller than that of
present-day ellipticals; it is as large or larger than any measured in recent intermediate-redshift
galaxy samples ($0.3 \lesssim z \lesssim 1$).

By comparing the observed correlation length of LBGs to that of the mass predicted from
CDM theory, we have estimated a linear bias for LBGs of b ~ 1.5 (4.5) for $q_0 = 0.1$ (0.5), in broad
agreement with our previous estimates based on preliminary spectroscopy (Paper 1). The strong
clustering and the large inferred bias of the LBGs are consistent with biased galaxy formation 
theories and provide additional evidence that these systems are associated with massive dark 
matter halos. 

The evolution of the slope of the correlation function of the mass (or, equivalently, that of
the power spectrum at small scales) has a pronounced dependence on $\Omega$. If the biasing parameter
is a weak function of the spatial scale, the measured slope of the correlation function of LBGs,
$\gamma = 1.98^{+0.3}_{-0.3}$, is inconsistent with the predictions of the standard CDM theory with $\Omega = 1$ at the
99.9% confidence level. N-body simulations seem to suggest that the bias will not be strongly
scale-dependent; however, until more is known about the scale dependence of the bias, this
conclusion will remain speculative.

The results of the clustering of LBGs at z ~ 3 emphasize that apparent evolution in the
clustering properties of galaxies may be due as much to variations in the effective light-to-mass
bias parameter among different galaxy samples as to evolution in the mass distribution
through gravitational instability. Our study shows that the traditional power-law model
$\xi(r, z) = \xi_0(r) \times (1+z)^{-(3+\epsilon)}$ used to describe the gravitational evolution of clustering
(Peebles 1980) is not a good representation for any value of $\epsilon$ over the redshift interval $z \lesssim 3$.

Table 1. The Observed Fields

#     Field                    Size (a)       N (b)   Surface density (c)   Mean separation (d)
1     0050+123 (CDF)           8.8 x 8.9      80      1.02                  59.0
2     1234+625 (HDF)           8.6 x 8.7      104     1.39                  50.9
3     1415+527 (Westphal)      15.1 x 15.1    293     1.29                  52.9
4     2215+000 (SSA22)         8.6 x 17.6     186     1.23                  54.1
4a    2215+000 (SSA22a)        8.6 x 8.9      87      1.14                  56.3
4b    2215+000 (SSA22b)        8.6 x 9.0      99      1.28                  53.1
5     2237+114 (DSF2237)       17.4 x 10.1    208     1.18                  55.2
5a    2237+114 (DSF2237a)      9.2 x 10.1     86      0.93                  62.4
5b    2237+114 (DSF2237b)      9.0 x 10.1     127     1.40                  50.8

a In units of arcmin$^2$.
b Number of LBG candidates with $\mathcal{R} < 25.5$.
c Surface density at $\mathcal{R} < 25.5$; galaxies per arcmin$^2$.
d Mean intergalaxy angular separation at $\mathcal{R} < 25.5$; arcsec.
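
The last two columns of Table 1 can be checked from the first ones. The sketch below recomputes the surface density as $N$ divided by the field area, and takes the mean separation to be the inverse square root of the surface density. The second definition is an assumption of this example: it reproduces the tabulated separations to within about 1%, but the text does not state the formula. The function names are hypothetical.

```python
import math


def surface_density(n_gal, width_arcmin, height_arcmin):
    """Galaxies per arcmin^2 for a rectangular field."""
    return n_gal / (width_arcmin * height_arcmin)


def mean_separation_arcsec(density_per_arcmin2):
    """Characteristic separation 1/sqrt(density), converted to arcsec.

    Assumed definition; not stated explicitly in the source.
    """
    return 60.0 / math.sqrt(density_per_arcmin2)


# Field 1 (CDF) and field 2 (HDF), with sizes and counts from Table 1.
sigma_cdf = surface_density(80, 8.8, 8.9)
sigma_hdf = surface_density(104, 8.6, 8.7)
sep_hdf = mean_separation_arcsec(sigma_hdf)

print(round(sigma_cdf, 2), round(sigma_hdf, 2), round(sep_hdf, 1))
```

The five main fields (1 to 5) together contain 80 + 104 + 293 + 186 + 208 = 871 candidates, the sample size used in the correlation analysis.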



Table 2. The Fitted Parameters

Estimator        $A_w$ (a)              $\beta$ (b)
PB rand          1.3 (+1.2, ...)        0.7 (+0.1, -0.1)
PB rand-split    2.1 (+3.1, -1.2)       ... (+0.2, ...)
PB xcor          1.0 (+1.2, ...)        0.7 (+0.2, -0.2)
PB xcor-split    2.3 (+4.7, ...)        ... (+0.3, ...)
PB all           ...                    0.8 (+0.3, -0.2)
LS rand          ...                    1.1 (+0.3, -0.2)
LS rand-split    4.1 (+10.0, -2.8)      1.2 (+0.3, -0.3)
LS xcor          ... (+5.1, -2.2)       1.1 (+0.2, -0.2)
LS xcor-split    ... (+12.3, ...)       ... (+0.3, -0.2)
LS all           4.1 (+7.8, ...)        1.1 (+0.3, -0.2)
PB + LS all      2.5 (...)              1.0 (+0.3, -0.3)

a Angular correlation amplitude in arcsec$^{\beta}$
b Angular correlation slope

Table 3. The Correlation Length (a)

                 $r_0$ (PB)             $r_0$ (LS)
$q_0 = 0.1$
rand             3.9 (+0.4, ...)        2.9 (+0.4, ...)
rand-split       ... (+0.5, ...)        2.7 (+0.5, -0.6)
xcor             ... (+0.5, ...)        ... (+0.5, ...)
xcor-split       ... (+0.5, ...)        2.8 (+0.5, -0.5)
all              ... (+0.5, ...)        ... (+0.5, ...)

$q_0 = 0.5$
rand             2.5 (+0.3, ...)        ... (+0.3, -0.3)
rand-split       ... (+0.2, ...)        1.7 (+0.3, -0.3)
xcor             2.4 (+0.3, -0.4)       2.1 (+0.3, -0.3)
xcor-split       2.0 (+0.3, -0.3)       1.8 (+0.3, -0.3)
all              ... (+0.3, ...)        1.9 (+0.3, -0.4)

a Comoving coordinates, in units of $h^{-1}$ Mpc

Fig. 1. The redshift distribution function $N(z)$ of the 376 LBGs used in the correlation analysis.
The bin size is $\Delta z = 0.2$. The interval $2.6 \lesssim z \lesssim 3.4$ contains 90% of the galaxies.

Fig. 2. Weighted average angular correlation function of LBGs. The filled points are from the
PB estimator, the open points from the LS one. The error bars are shown on the top of the figure.
The continuous line is the best-fit power law to the PB data points, the dotted line is the fit to the
LS. The thick horizontal continuous segment on the x axis marks the angular range over which we
computed the fits.

Fig. 3. The histograms of the correlation length $r_0$ (panels a and b) and of the slope $\beta$ (panel c) from
the Monte Carlo simulations. The thin continuous line is for the PB estimator, the broken line is
for the LS estimator. For each estimator, the Monte Carlo distributions corresponding to the four
different measures listed in Table 3 have been merged together. The thick continuous line is the
distribution of all the PB and LS Monte Carlo samples merged together.

Fig. 4. a) The strength of galaxy clustering as a function of redshift. Filled symbols are for
$q_0 = 0.1$, open symbols for $q_0 = 0.5$. Triangles are APM data (Loveday et al. 1995), squares are
CFRS data (Le Fevre et al. 1996), hexagons Keck K-band data (Carlberg et al. 1997), and circles
are the LBG data. The continuous solid and dashed lines are the expectations from the CDM
theory with $\Gamma^* = 0.25$, $\sigma_8 = 1.0$ and 0.5 for the two cases of $q_0 = 0.1$ and 0.5, respectively. b)
Linear bias as a function of redshift for the same data sets as a), assuming the CDM correlation
function. Symbols are defined as above.



